Discounted stochastic games poorly approximate undiscounted ones

Author

  • Peter Bro Miltersen
Abstract

The purpose of this note is to summarize recent results [6, 4, 5] on stochastic games by the author and his collaborators. These results have appeared or will appear in proceedings of computer science conferences. The intended reader of this note has an interest in finite stochastic games but no particular interest in computation; accordingly, the computational aspects of the results have been weeded out.

We consider two-player zero-sum finite (but infinite-duration) stochastic games $G$ with $N$ positions and at most $m$ actions available to each of the two players in each position. The reward to Player I when Player I plays $i$ and Player II plays $j$ in position $k$ is denoted $a^k_{ij}$. Transition probabilities are denoted $p^{kl}_{ij}$. We assume stopping probabilities are 0, i.e., for all $k, i, j$ we have $\sum_l p^{kl}_{ij} = 1$. To state our results as simply as possible, we also assume throughout that for all $k, l, i, j$ we have $a^k_{ij}, p^{kl}_{ij} \in \{0, 1\}$. In particular, we assume deterministic dynamics of nature and non-negative payoffs.

By $G$ we denote the game with limiting average (undiscounted) payoffs, i.e., payoff $\liminf_{t \to \infty} \left( \sum_{i=0}^{t-1} r_i \right) / t$ to Player I, where $r_i$ is the reward collected by Player I at stage $i$. By $G_\lambda$ we denote the game with payoffs discounted with a discount factor of $1 - \lambda$, i.e., with payoff $\lambda \sum_{i=0}^{\infty} (1 - \lambda)^i r_i$ to Player I. By $G_T$ we denote the finite game with $T$ stages and payoff $\left( \sum_{i=0}^{T-1} r_i \right) / T$ to Player I.

We shall be interested in the special case where $G$ is a recursive game in the sense of Everett [3]. In a recursive game, all non-zero rewards occur at absorbing states with only one action available to each player ("terminal states"). When $G$ is recursive, we let $\tilde{G}_T$ denote the finite game with payoff $r_{T-1}$ to Player I (i.e., the payoff is the reward Player I collects at the last stage of play). Note that when rewards are non-negative, $\mathrm{val}(G_T) \le \mathrm{val}(\tilde{G}_T)$. The seminal result of Mertens and Neyman [7] states:
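The payoff criteria defined above can be illustrated on a single reward stream. The following is a minimal Python sketch, not part of the note itself: the function names, the helper `recursive_play`, and the example parameters are all hypothetical, and the discounted sum is truncated to a finite prefix of the play. It evaluates one play of a recursive game (zero rewards until absorption, then a constant terminal reward) under the time-averaged, final-stage, and discounted criteria.

```python
from typing import List, Sequence


def average_payoff(rewards: Sequence[float]) -> float:
    """Payoff of the T-stage game: (r_0 + ... + r_{T-1}) / T."""
    return sum(rewards) / len(rewards)


def final_stage_payoff(rewards: Sequence[float]) -> float:
    """Payoff of the T-stage recursive variant: the last reward r_{T-1}."""
    return rewards[-1]


def discounted_payoff(rewards: Sequence[float], lam: float) -> float:
    """lambda * sum_i (1 - lambda)^i * r_i, truncated to the given prefix."""
    return lam * sum((1 - lam) ** i * r for i, r in enumerate(rewards))


def recursive_play(absorb_at: int, horizon: int, terminal_reward: float = 1) -> List[float]:
    """Hypothetical helper: reward stream of a play that earns 0 until it is
    absorbed in a terminal state at stage `absorb_at`, and `terminal_reward`
    at every stage from then on."""
    before = min(absorb_at, horizon)
    return [0] * before + [terminal_reward] * (horizon - before)


# Illustrative play: absorbed at stage 3, observed for T = 10 stages.
play = recursive_play(absorb_at=3, horizon=10)

avg = average_payoff(play)
last = final_stage_payoff(play)
disc = discounted_payoff(play, lam=0.5)

print(avg, last, disc)
```

For this play the time average is 7/10 while the final-stage payoff is 1, which is the play-by-play inequality behind the comparison of the two finite-game values when rewards are non-negative.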

Similar resources

Exact Algorithms for Solving Stochastic Games

Shapley’s discounted stochastic games, Everett’s recursive games and Gillette’s undiscounted stochastic games are classical models of game theory describing two-player zero-sum games of potentially infinite duration. We describe algorithms for exactly solving these games. When the number of positions of the game is constant, our algorithms run in polynomial time.

Discounted approximations of undiscounted stochastic games and Markov decision processes are already poor in the almost deterministic case

It is shown that the discount factor needed to solve an undiscounted mean payoff stochastic game to optimality is exponentially close to 1, even in one-player games with a single random node and polynomially bounded rewards and transition probabilities. On the other hand, for the class of the so-called irreducible games with perfect information and a constant number of random nodes, we obtain a ...

The Cooperative Solution of Stochastic Games

Building on the work of Nash, Harsanyi, and Shapley, we define a cooperative solution for strategic games that takes account of both the competitive and the cooperative aspects of such games. We prove existence in the general (NTU) case and uniqueness in the TU case. Our main result is an extension of the definition and the existence and uniqueness theorems to stochastic games discounted or und...

Finite-step Algorithms for Single-controller and Perfect Information Stochastic Games

After a brief survey of iterative algorithms for general stochastic games, we concentrate on finite-step algorithms for two special classes of stochastic games. They are Single-Controller Stochastic Games and Perfect Information Stochastic Games. In the case of single-controller games, the transition probabilities depend on the actions of the same player in all states. In perfect information st...

Every stochastic game with perfect information admits a canonical form

We consider discounted and undiscounted stochastic games with perfect information in the form of a natural BWR-model with positions of three types: VB Black, VW White, VR Random. These BWR-games lie in the complexity class NP∩CoNP and contain the well-known cyclic games (when VR is empty) and Markov decision processes (when VB or VW is empty). We show that the BWR-model is polynomial-time equiv...

Publication date: 2011